Abstract


We use the distance sum rule method to constrain the spatial curvature of the Universe with a large sample of 161
strong gravitational lensing systems, whose distances are calibrated from the Pantheon compilation of type Ia
supernovae using deep learning. To investigate the possible influence of the mass model of the lens galaxy on
the constraint on the curvature parameter $\Omega_k$, we consider three different lens models. The results show that a flat Universe
is supported in the singular isothermal sphere (SIS) model, with $\Omega_k$ = 0.049*0.127. In the
power-law (PL) model, by contrast, a closed Universe is preferred at the ~3σ confidence level, with
$\Omega_k$ = −0.24570071. In the extended PL (EPL) model, the 95% confidence level upper limit is $\Omega_k < 0.011$. As for the
parameters of the lens models, the constraints on the three models indicate that the mass profile of the lens galaxy
cannot be described simply by the standard SIS model.
Constraining the Spatial Curvature of the Local Universe with Deep Learning


1. Introduction 


The standard cosmological model (ΛCDM model), regarded
as a cornerstone solution derived from the homogeneous and
isotropic Friedmann–Robertson–Walker (FRW) metric, presents
a comprehensive framework in cosmology. This model
postulates the existence of radiation, ordinary baryonic matter, 
non-luminous dark matter, and enigmatic dark energy as 
constituents of the Universe. Its validity and credibility find 
strong support from a plethora of cosmological observations 
(Ade et al. 2016; Aghanim et al. 2020). Especially, the most 
recent findings derived from the conclusive full-mission 
analysis of the cosmic microwave background (CMB) 
anisotropies by the Planck mission exhibit remarkable agreement
with the prevailing spatially flat 6-parameter ΛCDM
cosmological model. These results not only validate the 
standard framework but also provide stringent constraints on 
the cosmological parameters with an exceptional level of 
precision (Aghanim et al. 2020). However, the success of the
ΛCDM model has been followed by some challenges. Recently, the
“$H_0$ tension problem,” i.e., the inconsistency between the value of the Hubble
constant $H_0$ measured from local type Ia supernovae (SNe Ia)
and the value inferred from the Planck
observation of the CMB, has attracted great attention (Freedman
2017; Riess et al. 2016, 2019; Di Valentino et al. 2021). This
discrepancy is possibly caused by either unknown systematic
uncertainties or new physics beyond the standard ΛCDM
cosmology.

In the context of the FRW metric, the spatial curvature 
parameter plays a pivotal role in elucidating the geometric 
nature of the Universe. By incorporating measurements from
the CMB and baryon acoustic oscillation (BAO), it has been
established that the Universe can be reasonably modeled as
spatially flat. This conclusion is supported by the constraint on
the curvature parameter, $\Omega_k = 0.001 \pm 0.002$ (Aghanim et al.
2020). Considering the intricate degeneracy between the 
curvature parameter and the equation of state of dark energy, 
the assumption of a flat Universe is commonly adopted during 
the analysis of dark energy properties. Even a small deviation of
the spatial curvature from zero could have a large effect on
the reconstruction of dark energy and on the evolution of the
Universe (Ichikawa & Takahashi 2006; Clarkson et al. 2007; 
Gong & Wang 2007; Virey et al. 2008). Although the Planck
CMB data constrain the spatial curvature with very high
precision, the constraint depends not only on the assumed
cosmological model (the ΛCDM model) but also on the
evolution of the early Universe. Recently, a reanalysis of
Planck data showed that a closed Universe is favored over a
flat Universe (Di Valentino et al. 2019, 2021). The presence of
the so-called “$H_0$ tension problem” suggests the possibility of
deviations between the actual state of the Universe and the
predictions of the standard ΛCDM model. Specifically, it
implies that the cosmological parameters derived from CMB 
measurements may differ from those obtained through local 
data. Consequently, it becomes crucial to ascertain the spatial 
curvature of the local Universe in a manner that is independent 
of specific theoretical models. 

The measurement of spatial curvature is generally a by-product
of the validity test of the FRW metric. A model-independent
approach was introduced by Clarkson et al.
(2007, 2008) to scrutinize the validity of the FRW metric.
This method involves a comparative analysis of the cosmic
expansion rate and cosmological distance, and has since been 
widely employed to examine the FRW metric and impose 
constraints on the spatial curvature (Mörtsell & Jönsson 2011;
Sapone et al. 2014; Cai et al. 2016). Bernstein (2006) put forth 
an alternative model-independent geometric approach to 
constrain spatial curvature. This methodology revolves around 
the fundamental sum rule of distances along null geodesics 
within the FRW metric framework. Räsänen et al. (2015) 
employed the distance sum rule (DSR) to evaluate the accuracy 
of the FRW metric. By combining data from SNe Ia and strong 
gravitational lensing (SGL), they examined the validity of the 
FRW metric. Their analysis confirmed the overall validity of
the FRW metric, although the resulting constraint on the spatial
curvature parameter was relatively weak.
Considering the interdependence between the curvature parameter
and the parameters of the lensing model, Xia et al. (2017)
adopted more intricate lensing models in their analysis and 
attained constraints on the spatial curvature by leveraging a 
substantial data set comprising 118 SGL systems (Cao et al. 
2015, 2016). Following this line, there were a series of works 
devoted to constraining the spatial curvature with updated 
observational data (Li et al. 2018; Qi et al. 2019; Liu et al. 
2020; Cao et al. 2022). It should be noted that the constraints 
on the spatial curvature derived from the aforementioned 
studies suffer from limitations arising from the relatively small 
size of the available SGL sample, as well as uncertainties 
stemming from unknown systematic effects. Moreover, the 
methods to calibrate the distances of lenses and sources within 
SGL systems rely on a polynomial approximation that is 
assumed to fit the SNe Ia sample. Further research and 
advancements in data acquisition and analysis techniques are 
necessary to address these limitations and improve the 
precision of spatial curvature measurements in cosmology. 

To alleviate the above shortcomings, Wang et al. (2020) 
made significant advancements in constraining the spatial 
curvature. They employed the DSR method and combined data 
from the Pantheon SN Ia compilation with a data set 
comprising 161 SGL systems. Notably, they avoided assumptions
regarding the parametric form of the distance-redshift
relation of SNe Ia. Instead, they employed a Gaussian Process
(GP) method to reconstruct the dimensionless comoving
distance based on the Pantheon compilation. Without a prior
on $H_0$, the constraints on spatial curvature are $\Omega_k$ = 0.571038 in
the singular isothermal sphere (SIS) model, $\Omega_k$ = 42460175
in the power-law (PL) model, and $\Omega_k$ = 0.25*0J$ in the
extended power-law (EPL) model. Previous studies have
indicated that a larger data set helps to achieve a tighter
constraint on $\Omega_k$ (Xia et al. 2017; Li et al. 2018; Qi et al. 2019).
However, the GP method is unable to extrapolate the curve 
beyond the available data region, and its accuracy diminishes 
significantly in regions where data points are sparse. 
Consequently, in their analysis, only the SGL systems with
redshifts lower than the maximum redshift of the SNe Ia data
could be utilized. This constraint reduced the
number of available SGL systems from the initial 161 to 135.
Therefore, although the constraint of Wang et al. (2020) is 
tighter than previous works, the method to reconstruct the 
distance-redshift relation can be further improved so that all 
SGL systems can be used to constrain the spatial curvature. 
In this paper, we will maintain the advantages of Wang et al. 
(2020), i.e., the large data set and model-independence, and 
employ a deep learning method to reconstruct the distance-redshift
relation based on the Pantheon data set, extending it up
to the maximum redshift of the available SGL systems. Deep
learning is the study of Artificial
Neural Networks (ANNs), which are composed of layers of
neurons modeled after the biological neurons in a human brain.
Deep learning is therefore well suited to large and highly
complex tasks, such as classification, clustering, and generation.
Deep learning has emerged as a powerful tool in various
cosmological research areas, demonstrating its effectiveness in 
tasks such as predicting galaxy morphology (Dieleman et al. 
2015), constraining dark energy (Escamilla-Rivera et al. 2019), 
and calibrating gamma-ray bursts (GRBs) (Luongo & Muccino 
2021; Tang et al. 2021b). In our recent work (Tang et al. 2021a),
we applied deep learning techniques to reconstruct the distance-redshift
relation of SNe Ia without making any assumptions
about the cosmological model or the parametric form of the 
relation. Furthermore, we utilized this reconstructed relation to 
investigate potential redshift dependencies in the luminosity 
corrections of GRBs. Unlike the GP method, which is
restricted to reconstructing the curve within the data region, deep
learning can extend the reconstruction beyond
the available data region. Thus all of the SGL systems can be
used, and the constraint on the spatial curvature is expected to be tighter.
The structure of the remaining sections of this paper is as 
follows: Section 2 provides an overview of the DSR method 
and the lens mass models utilized in constraining the spatial 
curvature. Section 3 outlines the observational data sets 
employed in the analysis and details of the procedure for 
reconstructing the distance-redshift relation using deep learning 
techniques. The obtained results are presented in Section 4. 
Lastly, Section 5 contains the discussion and summary. 


2. Methodology 


In the context of a homogeneous and isotropic Universe, the 
spacetime can be described by the FRW metric, given by 


$$ds^2 = -c^2 dt^2 + \frac{a^2(t)}{1 - K r^2}\,dr^2 + a^2(t)\,r^2\,d\Omega^2, \tag{1}$$


where $c$ represents the speed of light, and $K$ is a constant that
denotes the spatial curvature of the Universe. Specifically,
$K < 0$, $K = 0$, and $K > 0$ correspond to an open, flat,
and closed Universe, respectively. The scale factor $a(t)$
describes the expansion of the Universe with respect to
cosmic time, and its derivative $\dot{a} = da/dt$ defines the Hubble
parameter $H = \dot{a}/a$. To quantify the spatial separation between a
source at redshift $z_s$ and an observer at redshift $z_l$, the dimensionless
comoving distance is expressed as


$$d(z_l, z_s) = \frac{1}{\sqrt{|\Omega_k|}}\, S_k\!\left(\sqrt{|\Omega_k|}\int_{z_l}^{z_s}\frac{dz}{E(z)}\right), \tag{2}$$


where $\Omega_k \equiv -K c^2/(a_0^2 H_0^2)$ represents the normalized curvature
parameter, with $a_0$ the present-day value of the scale factor. The reduced Hubble parameter is denoted as
$E(z) = H(z)/H_0$, where $H_0$ represents the present-day value of
the Hubble parameter. The function $S_k$ is defined as follows


$$S_k(x) = \begin{cases} \sinh(x), & \Omega_k > 0, \\ x, & \Omega_k = 0, \\ \sin(x), & \Omega_k < 0. \end{cases} \tag{3}$$
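
As an illustration of Equations (2) and (3), the following minimal Python sketch evaluates the dimensionless comoving distance by a trapezoidal integral of $1/E(z)$. The function names, the number of integration steps, and the example background `e_lcdm` (a ΛCDM form of $E(z)$ with an assumed matter density of 0.3) are choices made for this example only; the analysis in this paper does not assume any such background.

```python
import math


def s_k(x, omega_k):
    """Curvature function S_k of Equation (3)."""
    if omega_k > 0:
        return math.sinh(x)
    if omega_k < 0:
        return math.sin(x)
    return x


def dimensionless_distance(z_l, z_s, omega_k, e_of_z, n_steps=2000):
    """d(z_l, z_s) of Equation (2), with a trapezoidal integral of 1/E(z)."""
    h = (z_s - z_l) / n_steps
    total = 0.5 * (1.0 / e_of_z(z_l) + 1.0 / e_of_z(z_s))
    for i in range(1, n_steps):
        total += 1.0 / e_of_z(z_l + i * h)
    integral = total * h
    if omega_k == 0:
        return integral
    root = math.sqrt(abs(omega_k))
    return s_k(root * integral, omega_k) / root


def e_lcdm(z, omega_m=0.3, omega_k=0.0):
    """Illustrative E(z) for a LambdaCDM background (an assumption of this sketch)."""
    omega_lambda = 1.0 - omega_m - omega_k
    return math.sqrt(omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_lambda)


# Example: distance from the observer to a source at z = 1 in a flat background.
d_example = dimensionless_distance(0.0, 1.0, 0.0, e_lcdm)
```

Any callable returning $E(z)$ can be passed as `e_of_z`, so the same routine applies to other expansion histories.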


For simplicity, we introduce the notation $d(z) = d(0, z)$,
$d_l = d(0, z_l)$, $d_s = d(0, z_s)$, and $d_{ls} = d(z_l, z_s)$. Under the
assumption that cosmic time $t$ and redshift $z$ have a one-to-one
correspondence, and with the condition that the derivative
of $d(z)$ with respect to $z$ satisfies $d'(z) > 0$, the three
dimensionless distances ($d_l$, $d_s$, and $d_{ls}$) are connected through
the DSR relation (Räsänen et al. 2015)


$$\frac{d_{ls}}{d_s} = \sqrt{1 + \Omega_k d_l^2} - \frac{d_l}{d_s}\sqrt{1 + \Omega_k d_s^2}. \tag{4}$$


If the Universe is accurately described by the FRW metric, the 
curvature parameter $\Omega_k$ should be a constant. Therefore, if the
validity of the FRW metric is confirmed, the DSR relation
provides a means to constrain the value of $\Omega_k$. By analyzing the
relation between the dimensionless distances, we can obtain 
valuable insights into the spatial curvature of the Universe. 
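
The distance sum rule can be sketched in a few lines of Python. The function name and the choice to raise an error for unphysical inputs are assumptions of this example; the formula itself is the DSR relation of Equation (4).

```python
import math


def dsr_distance_ratio(d_l, d_s, omega_k):
    """Return d_ls / d_s from the distance sum rule, Equation (4)."""
    term_l = 1.0 + omega_k * d_l ** 2
    term_s = 1.0 + omega_k * d_s ** 2
    if term_l < 0.0 or term_s < 0.0:
        # The square roots are only defined when 1 + Omega_k d^2 >= 0.
        raise ValueError("1 + Omega_k * d^2 must be non-negative")
    return math.sqrt(term_l) - (d_l / d_s) * math.sqrt(term_s)


# In a flat Universe the rule reduces to d_ls / d_s = 1 - d_l / d_s.
flat_ratio = dsr_distance_ratio(0.2, 0.8, 0.0)
```

For $\Omega_k > 0$ the rule follows from the identity $\sinh(a - b) = \sinh a \cosh b - \cosh a \sinh b$ applied to Equation (2), which gives a convenient numerical check.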
The dimensionless comoving distances $d_l$ and $d_s$ can be
obtained through the analysis of SN Ia data. On the other hand,
the distance ratio $d_{ls}/d_s$ is determined using the data from SGL
observations, where it can be expressed in terms of the Einstein radius and the velocity
dispersion associated with the lens mass profile.
For certain gravitational
lens systems, the mass distribution of the lens has been observed 
to closely approximate an isothermal profile (Cohn et al. 2001; 
Muñoz et al. 2001; Rusin et al. 2002; Treu & Koopmans 2002;
Rusin & Kochanek 2005). Consequently, the SIS model has 
emerged as a prevalent and straightforward choice for describing 
the lens mass profile. This model effectively emulates the flat 
rotation curves characteristic of galaxies, featuring a density 
inversely proportional to the square of the galaxy's radius. 
Additionally, the structure of galaxies has been extensively 
explored through N-body simulations (Navarro et al. 1996; 
Moore et al. 1998). Navarro et al. (1996)
discovered that the halo profile of a galaxy exhibits an
approximate isothermal behavior across a wide range of radii.
However, it deviates from the $r^{-2}$ power law near the central region,
transitioning to a profile steeper than $r^{-2}$ as the distance
approaches the virial radius. To adequately address the specific 
features of the halo profile, a density function incorporating a PL 
dependence on radius has been introduced to describe the lens 
profile. This density function resembles the generalized 
Navarro–Frenk–White (NFW) profile (Navarro et al. 1996).
Among these descriptions, the PL model stands out, characterized
by a variable PL index denoted as $\gamma$. When $\gamma$ is
set to 2, the profile reduces to that of an SIS. It
is noteworthy that these profile descriptions do not inherently 
distinguish between luminosity density and total mass density. 
When accounting for the presence of dark matter, the luminosity 
density may diverge from the overall galaxy profile. This 
prompts the introduction of models like the EPL model, which 
accommodates the complexities stemming from both luminosity 
density and dark matter distribution within the lens mass. Hence, 
we consider three distinct lens models: the SIS model, the PL 
model, and the EPL model, to comprehensively investigate the 
influence of various lens galaxy mass profiles on the constraints 
imposed on the spatial curvature. 

Within the SIS model, the mass density distribution of the 
lens galaxy follows a scaling relation of $\rho \propto r^{-2}$. This leads to
an expression for the distance ratio as follows (Mollerach & 
Roulet 2002) 
$$\frac{d_{ls}}{d_s} = \frac{c^2 \theta_E}{4\pi \sigma_{\rm SIS}^2}, \tag{5}$$


where $\theta_E$ represents the Einstein radius, and $\sigma_{\rm SIS}$ is the velocity
dispersion associated with the lens mass profile. It is worth
noting that the observed stellar
velocity dispersion $\sigma_0$ need not equal $\sigma_{\rm SIS}$ within the context of the SIS
model (Khedekar & Chakraborti
2011). Consequently, to
account for such a deviation, a phenomenological parameter $f$ is
introduced, yielding $\sigma_{\rm SIS} = f\sigma_0$ (Kochanek 1992; Ofek et al.
2003; Cao et al. 2012; Li et al. 2019). The free
parameter $f$ is expected to fall within the range
$0.8 < f^2 < 1.2$ (Ofek et al. 2003). In practice, the velocity
dispersion is typically measured within the aperture radius $\theta_{\rm ap}$
in actual SGL data. To convert the measurement to $\sigma_0$, an
aperture correction formula (Jorgensen et al. 1995) can be
employed, given by the equation


$$\sigma_0 = \sigma_{\rm ap}\left(\frac{\theta_{\rm eff}}{2\theta_{\rm ap}}\right)^{\eta}, \tag{6}$$


where $\sigma_{\rm ap}$ represents the luminosity-weighted average of the
line-of-sight velocity dispersion within the aperture radius, $\theta_{\rm eff}$
corresponds to the effective angular radius, and $\eta$ is the
correction factor fixed to −0.066 (Cappellari et al. 2006; Chen
et al. 2019). The uncertainty
of $\sigma_{\rm ap}$ propagates to $\sigma_0$ and subsequently to
$\sigma_{\rm SIS}$. Additionally, the uncertainty in the distance ratio $d_{ls}/d_s$ is
derived from the uncertainties in $\theta_E$ and $\sigma_{\rm SIS}$. In this work, we
adopt a fractional uncertainty of 5% for $\theta_E$ (Liao et al. 2016).
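
The SIS relations above can be sketched as follows. The function names and unit conventions (Einstein radius in arcseconds, velocity dispersions in km/s) are assumptions of this example, and the uncertainty formula is a standard first-order propagation for a ratio scaling as $\theta_E/\sigma^2$, which may differ in detail from the propagation used in the analysis.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def aperture_corrected_dispersion(sigma_ap, theta_eff, theta_ap, eta=-0.066):
    """Equation (6): sigma_0 = sigma_ap * (theta_eff / (2 * theta_ap)) ** eta."""
    return sigma_ap * (theta_eff / (2.0 * theta_ap)) ** eta


def sis_distance_ratio(theta_e_arcsec, sigma_0, f=1.0):
    """Equation (5) with sigma_SIS = f * sigma_0 (velocities in km/s)."""
    theta_e = theta_e_arcsec * ARCSEC_TO_RAD
    return C_KM_S ** 2 * theta_e / (4.0 * math.pi * (f * sigma_0) ** 2)


def sis_ratio_uncertainty(ratio, frac_theta_e, frac_sigma):
    """First-order propagation: the ratio scales as theta_E / sigma^2."""
    return ratio * math.sqrt(frac_theta_e ** 2 + 4.0 * frac_sigma ** 2)


# Example: a lens with theta_E = 1 arcsec and sigma_0 = 250 km/s.
sigma_0 = aperture_corrected_dispersion(250.0, theta_eff=2.0, theta_ap=1.0)
ratio = sis_distance_ratio(1.0, sigma_0)
ratio_err = sis_ratio_uncertainty(ratio, frac_theta_e=0.05, frac_sigma=0.05)
```

Because the ratio depends on $f$ only through $1/f^2$, a value of $f$ slightly above unity lowers every SGL distance ratio by the same factor.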

Within the framework of the PL spherical model, the mass 
density distribution of the lensing galaxy is characterized by a 
spherically symmetric PL behavior, expressed as $\rho \propto r^{-\gamma}$,
where $\gamma$ represents the PL index. The distance ratio in the PL
model can be described as (Koopmans et al. 2006) 


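$$\frac{d_{ls}}{d_s} = \frac{c^2\theta_E}{4\pi\sigma_{\rm ap}^2}\left(\frac{\theta_{\rm ap}}{\theta_E}\right)^{2-\gamma} f^{-1}(\gamma), \tag{7}$$
where
$$f(\gamma) = -\frac{1}{\sqrt{\pi}}\,\frac{(5-2\gamma)(1-\gamma)}{3-\gamma}\,\frac{\Gamma(\gamma-1)}{\Gamma(\gamma-3/2)}\left[\frac{\Gamma(\gamma/2-1/2)}{\Gamma(\gamma/2)}\right]^2. \tag{8}$$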


It is worth noting that when $\gamma$ takes the value of 2, the
PL model reduces to the standard SIS model. To account
for the potential redshift evolution of the mass density
profile, we introduce a parameterization for $\gamma$, expressed as
$\gamma(z_l) = \gamma_0 + \gamma_1 z_l$, where $\gamma_0$ and $\gamma_1$ represent two independent
free parameters.
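
A short Python sketch of the PL distance ratio is given below. It assumes the form of Equations (7) and (8) commonly used with the Koopmans et al. (2006) model, and the function names and argument conventions (angles in radians, velocities in km/s) are choices made for this example. The SIS limit at $\gamma = 2$ stated above provides a direct check.

```python
import math


def f_gamma(gamma):
    """Equation (8). gamma must avoid the poles of the Gamma functions (e.g. 1.5)."""
    prefactor = -(1.0 / math.sqrt(math.pi)) * (5.0 - 2.0 * gamma) * (1.0 - gamma) / (3.0 - gamma)
    ratio_1 = math.gamma(gamma - 1.0) / math.gamma(gamma - 1.5)
    ratio_2 = (math.gamma(gamma / 2.0 - 0.5) / math.gamma(gamma / 2.0)) ** 2
    return prefactor * ratio_1 * ratio_2


def pl_distance_ratio(theta_e, theta_ap, sigma_ap, z_l, gamma_0, gamma_1=0.0, c=299792.458):
    """Equation (7) with gamma(z_l) = gamma_0 + gamma_1 * z_l.

    theta_e and theta_ap are in radians; sigma_ap and c are in km/s.
    """
    gamma = gamma_0 + gamma_1 * z_l
    sis_part = c ** 2 * theta_e / (4.0 * math.pi * sigma_ap ** 2)
    return sis_part * (theta_ap / theta_e) ** (2.0 - gamma) / f_gamma(gamma)


# With gamma_0 = 2 and gamma_1 = 0 the aperture factor and f(gamma) both equal 1.
check = f_gamma(2.0)
```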

Within the EPL model, the luminosity density profile $\nu(r)$
can differ from the total mass density profile $\rho(r)$, accounting
for the presence of a dark matter halo. We adopt the following 
functional forms for the PL mass density profile and the 
luminosity density of stars, 


$$\rho(r) = \rho_0\left(\frac{r}{r_0}\right)^{-\alpha}, \qquad \nu(r) = \nu_0\left(\frac{r}{r_0}\right)^{-\delta}, \tag{9}$$


where $\alpha$ and $\delta$ correspond to the PL index parameters, $r_0$
represents the characteristic length scale, and $\rho_0$ and $\nu_0$ are
normalization constants. The distance ratio in the EPL model is 
expressed as (Birrer et al. 2019; Lee 2021) 


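$$\frac{d_{ls}}{d_s} = \frac{c^2\theta_E}{2\sigma_0^2\sqrt{\pi}}\,\frac{3-\delta}{(\xi-2\beta)(3-\xi)}\left(\frac{\theta_{\rm eff}}{\theta_E}\right)^{2-\alpha}\left[\frac{\lambda(\xi)-\beta\lambda(\xi+2)}{\lambda(\alpha)\lambda(\delta)}\right], \tag{10}$$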


where $\xi = \alpha + \delta - 2$, $\lambda(x) = \Gamma\!\left(\frac{x-1}{2}\right)/\Gamma\!\left(\frac{x}{2}\right)$, and $\beta$ represents
an anisotropy parameter that characterizes the anisotropic
distribution of the three-dimensional velocity dispersion. In
accordance with Wang et al. (2020), we consider $\beta$ as a
nuisance parameter and marginalize over it with a Gaussian
prior of $\beta = 0.18 \pm 0.13$. Simultaneously, we treat $\alpha$ and $\delta$ as
free parameters. It is worth noting that when $\alpha = \delta = 2$ and
$\beta = 0$, the EPL model reduces to the standard SIS model.
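
The EPL distance ratio can be sketched in the same way. This example assumes the commonly used form of the EPL relation, with $\xi = \alpha + \delta - 2$ and $\lambda(x) = \Gamma((x-1)/2)/\Gamma(x/2)$, evaluated with the aperture-corrected dispersion $\sigma_0$; the function names and unit conventions are illustrative. The SIS limit quoted above ($\alpha = \delta = 2$, $\beta = 0$) serves as a consistency check.

```python
import math


def lam(x):
    """lambda(x) = Gamma((x - 1) / 2) / Gamma(x / 2)."""
    return math.gamma((x - 1.0) / 2.0) / math.gamma(x / 2.0)


def epl_distance_ratio(theta_e, theta_eff, sigma_0, alpha, delta, beta, c=299792.458):
    """EPL distance ratio d_ls / d_s; angles in radians, velocities in km/s."""
    xi = alpha + delta - 2.0
    prefactor = c ** 2 * theta_e / (2.0 * sigma_0 ** 2 * math.sqrt(math.pi))
    shape = (3.0 - delta) / ((xi - 2.0 * beta) * (3.0 - xi))
    scale = (theta_eff / theta_e) ** (2.0 - alpha)
    bracket = (lam(xi) - beta * lam(xi + 2.0)) / (lam(alpha) * lam(delta))
    return prefactor * shape * scale * bracket


# SIS limit: alpha = delta = 2 and beta = 0.
epl_sis_limit = epl_distance_ratio(5e-6, 8e-6, 250.0, 2.0, 2.0, 0.0)
```

In a sampler, $\beta$ would be drawn from its Gaussian prior at each step rather than fixed, so that it is marginalized over together with the other parameters.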


3. Observational Data and Deep Learning 


The distance ratios $d_{ls}/d_s$ are obtained from the observations
of SGL systems. In a recent study, Chen et al. (2019) compiled 
a new SGL sample by combining data from various galaxy 
surveys, including the Lenses Structure and Dynamics (LSD) 
survey (Treu & Koopmans 2004), the Sloan Lens ACS 
(SLACS) survey (Bolton et al. 2006), the CFHT Strong 
Lensing Legacy Survey (Cabanac et al. 2008), and the BOSS 
Emission-Line Lens Survey (Brownstein et al. 2012). This 
compiled sample consists of 161 galaxy-scale SGL systems, 
covering a redshift range of $z_l \in [0.0624, 1.004]$ for the lens
galaxies and $z_s \in [0.197, 3.595]$ for the source galaxies.
Figure 1 shows the distribution of the SGL sample
from the different surveys in the $z_l$-$z_s$
plane, together with the redshift distribution of the
lenses, most of which
lie at $z_l \sim 0.2$.

The dimensionless comoving distances $d_l$ and $d_s$ are derived
from the luminosity distance $D_L$ of SNe Ia using the relation


$$d(z) = \frac{H_0 D_L(z)}{c\,(1+z)}. \tag{11}$$

The luminosity distance $D_L$ can be obtained from the light
curve of SNe Ia. Considering a specific redshift z, the distance 
modulus of SNe Ia can be expressed as 


$$\mu = 5\log_{10}\frac{D_L}{\rm Mpc} + 25 = m_{B,{\rm corr}} - M_B, \tag{12}$$


where $M_B$ represents the absolute magnitude and $m_{B,{\rm corr}}$
denotes the corrected apparent magnitude observed in the B band,
reported in the largest and most recent Pantheon data set
(Scolnic et al. 2018). The redshift range of the SN Ia sample
used in our work, i.e., the Pantheon data set, is $z \in [0.01, 2.30]$.
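
The conversion from a corrected apparent magnitude to the dimensionless comoving distance, following Equations (11) and (12), can be sketched as below. The function name is illustrative, the default values of $H_0$ and $M_B$ are the ones adopted later in this section, and the uncertainty is a simple first-order propagation of the magnitude error, which is an assumption of this example.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def dimensionless_distance_from_sn(z, m_b_corr, sigma_m, h0=70.0, abs_mag=-19.36):
    """Return (d, sigma_d) for one supernova.

    mu = m_B,corr - M_B gives D_L in Mpc through Equation (12);
    Equation (11) then gives d = H_0 * D_L / (c * (1 + z)).
    """
    mu = m_b_corr - abs_mag
    d_lum_mpc = 10.0 ** ((mu - 25.0) / 5.0)
    d = h0 * d_lum_mpc / (C_KM_S * (1.0 + z))
    # d scales as 10^(mu / 5), so sigma_d = (ln 10 / 5) * d * sigma_mu.
    sigma_d = (math.log(10.0) / 5.0) * d * sigma_m
    return d, sigma_d


# Example: a supernova at z = 0.5 with m_B,corr = 22.9 +/- 0.1 (made-up numbers).
d_sn, sigma_d_sn = dimensionless_distance_from_sn(0.5, 22.9, 0.1)
```

Since $d$ depends on $H_0$ and $M_B$ only through the combination $H_0\,10^{-M_B/5}$, the two adopted values are degenerate in this conversion.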

To obtain the comoving distances d at the redshifts of the 
lens and source for all the SGL systems, it is necessary to 
reconstruct a continuous curve of the distance-redshift relation 
d(z) based on the Pantheon sample. Previous work by Wang 
et al. (2020) employed the GP method to reconstruct a smooth 
curve of d(z) from SN Ia data. However, the reconstructed 
uncertainty of the GP method tends to be large in regions where 
the data points are sparse, and it becomes even more 
challenging to estimate distances beyond the observed redshift 
range. Consequently, SGL systems with source redshifts larger 
than 2.3 could not be utilized in their analysis. 

In this paper, we adopt a deep learning method to reconstruct 
the distance-redshift curve without any specific assumption 
about its parametric form. This approach allows us to 
reconstruct the distance-redshift relation using a wide range 
of redshifts, covering the entire redshift range of the SGL 
sample. Specifically, we can extend the reconstruction up to a
redshift of $z = 4$, thus ensuring that we encompass the full
redshift range of the SGL systems under consideration. Deep
learning thus enables us to overcome the
limitations associated with sparse data points and to extrapolate
the distance-redshift relation to regions beyond the direct
observational range.

Deep learning has emerged as a powerful methodology for 
analyzing complex and intricate data sets. One common 
approach involves the utilization of ANNs as underlying 
models, such as Convolutional Neural Networks (CNNs), 
Recurrent Neural Networks (RNNs), and Bayesian Neural 
Networks (BNNs), among others. These neural networks 
typically consist of multiple layers of interconnected processing 
units, where each layer receives information from the previous 
layer, transforms it and then propagates it to the subsequent 
layer. Through training, these networks aim to learn and 
represent the underlying patterns and structures within the data. 
In the context of our research, we employ RNNs as a key 
component of our deep learning approach. RNNs are well- 
suited for handling sequential data and making predictions 
based on learned data representations. By feeding the Pantheon 
data set into the RNN, we can effectively capture the 
relationship between the distance modulus $\mu$ and the redshift
$z$. This enables us to predict distances at arbitrary redshifts,
even beyond the range covered by the observational data.
However, RNNs alone do not provide uncertainty
estimates for these predictions. To address this limitation,
we incorporate BNNs into our network architecture. BNNs 
serve as a complementary component to the RNNs and allow 
us to calculate the uncertainty associated with the distance 
predictions. Our previous work (Tang et al. 2021a)
incorporated both RNNs and BNNs to model the distance
modulus-redshift relationship based on the Pantheon data set,
while the current work focuses on the reconstruction of the
distance curve $d(z)$ using the deep learning approach.

The architecture of our network is illustrated in Figure 2. The 
central component is the RNN, which consists of three layers: 
an input layer that receives the redshift z as the feature, a 
hidden layer that processes information from the previous layer 
and passes it to the next layer, and an output layer that 
generates the target output, which in this case is the comoving 
distance d. The RNN is designed to capture the temporal 
dependencies and patterns in the input data. To overcome the 
challenges associated with training RNNs on long sequential 
data and to address the issue of information retention over long 
periods, we employ Long Short-Term Memory (LSTM) cells 
as the basic units of our network. LSTM cells enhance RNNs 
by incorporating explicit memory mechanisms, allowing the 
network to selectively store, discard, and retrieve information. 
The input and hidden layers of our network consist of 100 
LSTM cells each. 

In the training process, the RNN is fed with the Pantheon 
data to learn and represent the relationship between the 
comoving distance d and the redshift z. This is achieved by 
minimizing a loss function that quantifies the discrepancy 
between the network's predictions and the observed distances. 
In this work, we utilize the mean-squared-error (MSE) function 
as the loss function, and we employ the Adam optimizer to find 
the minimum of this function. To enhance the network's 
performance, we introduce a non-linear activation function 
in the network. In our previous research on reconstructing the
distance modulus $\mu(z)$, we found that the hyperbolic tangent
(tanh) function outperformed other activation functions such as
ReLU, ELU, and SELU. However, since we are now
reconstructing the comoving distance $d(z)$ instead of the
distance modulus, we compare the performance of all four
activation functions (tanh, ReLU, ELU, and SELU) to
determine the most suitable choice. By setting the time step
to $t = 4$ and using the LSTM-based RNN architecture
with appropriate activation functions, our network aims to learn
the relationship between the redshift $z$ and the
comoving distance $d$. Through training and optimization, we
obtain a model that can predict distances at arbitrary redshifts,
including those beyond the range covered by the Pantheon
data set.

Figure 1. Left: The distribution of the SGL sample obtained from various surveys in the $z_l$-$z_s$ plane. Right: The redshift distribution of lenses.

Figure 2. Left: The network architecture, comprising a single hidden layer. Right: The network unfolded up to time step $t = 4$, with $z^{(i)}$
representing the $i$th time step. In our network, both the input layer and hidden layer are composed of LSTM cells, housing 100 neurons each. The output layer is a fully
connected (dense) layer. To mitigate overfitting, the dropout technique is implemented between each LSTM cell and its subsequent layer.
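
A network of the kind described above could be assembled with the Keras API of TensorFlow roughly as follows. This is a minimal sketch, not the code used for the analysis: the function name, the input shape (four time steps with one feature each), and the placement of the dropout layers are assumptions, while the 100 LSTM units per layer, the MSE loss, the Adam optimizer, and the dropout rate of 0.2 follow the values quoted in this section.

```python
def build_lstm_network(time_steps=4, units=100, dropout_rate=0.2, activation="tanh"):
    """Two stacked LSTM layers with dropout, followed by a dense output layer."""
    # TensorFlow is imported lazily so that this module loads without it.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(time_steps, 1)),  # one redshift per time step
        tf.keras.layers.LSTM(units, activation=activation, return_sequences=True),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.LSTM(units, activation=activation, return_sequences=True),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(1),  # one distance per time step
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

Passing `activation="relu"`, `"elu"`, or `"selu"` gives the alternative networks compared in this work; training would then call `model.fit` on redshift sequences as inputs and normalized distances as targets.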

In the context of BNN, it is worth noting that designing a 
traditional BNN is a challenging task due to its inherent 
complexity. Fortunately, Gal & Ghahramani (2016a, 2016b,
2016c) demonstrated that dropout, commonly used in deep
neural networks as a regularization technique to address the
issue of overfitting, can be viewed as an approximation to
Bayesian inference in deep GPs. This means that a network
incorporating dropout can be treated as an approximate
Bayesian model. In this study, we incorporate
the dropout technique within the RNN to emulate the 
characteristics of a BNN. By executing the trained network
multiple times, we can generate multiple predictions for the
comoving distance at different redshifts. This process allows us
to obtain a range of possible predictions and, consequently,
to estimate the confidence region associated with these predictions.
This approach effectively mimics the behavior of a BNN,
where the network models the posterior distribution over the 
parameters. In our research, we employ a dropout rate of 0.2 
between the LSTM layer and its subsequent layer. 
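
The repeated-execution step can be illustrated with a small, library-independent Python sketch. The function `mc_dropout_summary` and the stand-in `toy_predictor` are hypothetical names introduced for this example; the toy predictor merely imitates the random scatter of a dropout network and has no physical meaning.

```python
import random
import statistics

_RNG = random.Random(0)  # fixed seed so that the example is reproducible


def mc_dropout_summary(stochastic_predict, z_grid, n_runs=100):
    """Run a stochastic predictor n_runs times; return per-redshift mean and std."""
    runs = [stochastic_predict(z_grid) for _ in range(n_runs)]
    means, stds = [], []
    for j in range(len(z_grid)):
        column = [run[j] for run in runs]
        means.append(statistics.fmean(column))
        stds.append(statistics.stdev(column))
    return means, stds


def toy_predictor(z_grid):
    """Stand-in for a dropout network: a smooth curve plus random scatter."""
    return [z / (1.0 + 0.3 * z) + _RNG.gauss(0.0, 0.01) for z in z_grid]


means, stds = mc_dropout_summary(toy_predictor, [0.5, 1.0, 2.0], n_runs=500)
```

With a Keras model, dropout stays active at prediction time when the model is called as `model(x, training=True)`, so `stochastic_predict` could wrap such a call.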

To begin the reconstruction of the comoving distance $d(z)$,
we first normalize the comoving distance data obtained from
the Pantheon compilation according to Equations (11) and (12)
with the chosen parameters $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and
$M_B = -19.36$ (Scolnic et al. 2014). Next, we sort the
normalized data points $(z_i, d_i)$ in ascending order of redshift
$z_i$ and reorganize them into four sequences. In each sequence,
the redshifts and the corresponding normalized distances are
used as input and output vectors, respectively, for training the
network. Subsequently, we train the network constructed as
described above using TensorFlow (https://www.tensorflow.org) for a total of 1000
iterations. The trained network is saved for later use. In
the final step, we execute the trained network 1000 times to
predict the distance $d$ over the redshift range $z \in [0, 4]$. The
distribution of the predicted distances at each redshift is taken to be
Gaussian.

Figure 3. The reconstructions of $d(z)$ from the Pantheon data, employing four distinct activation functions. Top-left: tanh; top-right: ReLU; bottom-left: ELU; bottom-right: SELU.

The results of the distance reconstruction using the four 
activation functions (tanh, ReLU, ELU, and SELU) are plotted 
in Figure 3. For comparison, we also include the best-fitting
curve of the ΛCDM model (represented by the black line). It is
worth noting that while the uncertainty in the reconstructed 
curve using deep learning may be slightly larger than that 
obtained using the GP method within the data region, the 
advantage of deep learning lies in its ability to reconstruct the 
curve beyond the data region. This enables us to leverage the 
full sample of SGL systems. 

As shown in Figure 3, only the
reconstructions using the tanh and SELU activation functions
are consistent with the flat ΛCDM model within the 1σ
confidence level. Considering that most current
cosmological probes favor the ΛCDM model, and that the
curves reconstructed with the ReLU and ELU functions deviate
strongly from the ΛCDM model at high redshift, these two
activation functions are excluded from the following calculation.
Therefore, we derive the dimensionless comoving
distances of the SGL systems from the reconstructions with
the tanh and SELU functions. We emphasize that the
curves reconstructed using deep learning are independent of the
cosmological model. The ΛCDM curves plotted in Figure 3 are
just for comparison.


4. Results 


With the reconstructed $d(z)$ curve, we can obtain the
dimensionless comoving distance and the corresponding
uncertainty at $z_l$ and $z_s$, and then calculate the distance ratio
$R_{\rm SNe} = d_{ls}/d_s$ according to Equation (4); the uncertainty
$\sigma_{R_{\rm SNe}}$ propagates from the uncertainties of $d_l$ and $d_s$. In addition,
the distance ratios $R_{\rm SGL} = d_{ls}/d_s$ can also be obtained from
SGL systems using Equations (5), (7), or (10), according to
the mass model of the lens galaxy. The corresponding
uncertainty $\sigma_{R_{\rm SGL}}$ propagates from the uncertainties of the
SGL observations.

Table 1
The Best-fitting Parameters in the Framework of SIS model Using the Distance
Reconstructed with Tanh and SELU Functions

Activation    $\Omega_k$    $f$
tanh          0.049*047     1.038* 0008
SELU          0.082023      1.039* 0005

To compare the distance ratios obtained
from SNe Ia and SGL systems, we determine the best-fitting
parameters by maximizing the likelihood function, which is
proportional to the exponential of the negative chi-square
statistic, i.e., $\mathcal{L} \propto \exp(-\chi^2/2)$, where


$$\chi^2(\mathbf{p}, \Omega_k) = \sum_{i=1}^{161}\frac{\left(R_{\rm SNe} - R_{\rm SGL}\right)^2}{\sigma_{\rm total}^2}. \tag{13}$$


Here, $\mathbf{p}$ represents the set of parameters for the lens mass
profile, where $\mathbf{p} = f$ for the SIS model, $\mathbf{p} = (\gamma_0, \gamma_1)$ for the PL
model, and $\mathbf{p} = (\alpha, \delta)$ for the EPL model. The term $\sigma_{\rm total}$
represents the total uncertainty, which includes contributions
from the uncertainty in the reconstruction and the uncertainty
propagated from the SGL observations,


$$\sigma_{\rm total}^2 = \sigma_{R_{\rm SNe}}^2 + \sigma_{R_{\rm SGL}}^2. \tag{14}$$


Assuming flat priors on all free parameters, we calculate the
posterior probability density function (PDF) of the parameter
space using the Python package emcee (Foreman-Mackey et al.
2013). It is worth noting that the prior on the spatial curvature
parameter is restricted to $\Omega_k > -0.39$ to ensure that $1 + \Omega_k d_l^2 \geq 0$
and $1 + \Omega_k d_s^2 \geq 0$ hold within the redshift range $z < 4$.
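The prior and posterior described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in this work: the upper bound `omega_k_max`, the `lens_bounds` ranges and the `chi2_fn` callable are hypothetical placeholders, since only the lower bound on $\Omega_k$ is quoted in the text.

```python
import math

OMEGA_K_MIN = -0.39  # lower prior bound quoted in the text


def max_distance_allowed(omega_k_min=OMEGA_K_MIN):
    """Largest dimensionless distance d for which 1 + omega_k_min * d**2 >= 0."""
    return math.sqrt(-1.0 / omega_k_min)


def log_prior(omega_k, lens_params, lens_bounds, omega_k_max=1.0):
    """Flat prior: 0 inside the allowed box, -inf outside.

    omega_k_max and lens_bounds are hypothetical choices for illustration.
    """
    if not (OMEGA_K_MIN < omega_k < omega_k_max):
        return -math.inf
    for value, (low, high) in zip(lens_params, lens_bounds):
        if not (low < value < high):
            return -math.inf
    return 0.0


def log_posterior(theta, chi2_fn, lens_bounds):
    """Log posterior up to a constant: flat log prior minus chi2 / 2."""
    omega_k, *lens_params = theta
    lp = log_prior(omega_k, lens_params, lens_bounds)
    if not math.isfinite(lp):
        return lp
    return lp - 0.5 * chi2_fn(omega_k, lens_params)
```

With the bound $\Omega_k > -0.39$, the square roots in the distance sum rule stay real for dimensionless distances up to about 1.60. A function with this signature could be handed to `emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=(chi2_fn, lens_bounds))` to sample the posterior.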

In the context of the SIS lens model, we present the best-
fitting parameters obtained using the reconstructions with the
tanh and SELU activation functions in Table 1. Additionally,
we provide the $1\sigma$ and $2\sigma$ confidence contours as well as the
marginalized PDFs for the parameter space in Figure 4. The
spatial curvature is constrained to a best-fitting value of
$\Omega_k = 0.049$ with the tanh function and $\Omega_k = 0.082$ with the
SELU function. The constraints on $\Omega_k$ with both functions
support a flat Universe within the $1\sigma$ confidence level,
consistent with the Planck results (Aghanim et al. 2020). The
constraint on the parameter $f$ is rather tight, with best-fitting
values of $f = 1.038$ with the tanh function and $f = 1.039$ with
the SELU function. Both exclude the standard SIS model ($f = 1$)
at more than the $4\sigma$ confidence level. This indicates that the
lens mass profile deviates slightly, but with high statistical
significance, from the standard SIS model.

In the context of the PL lens model, the best-fitting parameters are
presented in Table 2, and the contours and PDFs for
the parameter space are plotted in Figure 5. As in the SIS
model, the constraints with the tanh and SELU activation functions
are consistent with each other at the $1\sigma$ confidence level.
Figure 4. The two-dimensional confidence contours and one-dimensional
PDFs for the parameters within the SIS model framework. The
results obtained using the distances reconstructed with the tanh and SELU
functions are represented by the red and blue lines, respectively.


Table 2
The Best-fitting Parameters in the Framework of the PL Model Using the Distance
Reconstructed with Tanh and SELU Functions

          $\Omega_k$    $\gamma_0$    $\gamma_1$
tanh      -0.245        2.076         -0.309
SELU      -0.232        2.074         -0.307


However, the constraint on the curvature parameter in the PL
model differs markedly from that in the SIS model. The
spatial curvature is constrained to be $\Omega_k = -0.245^{+0.075}_{-0.071}$ with
the tanh function, with a consistent best-fitting value of
$\Omega_k = -0.232$ with the SELU function (Table 2). The
constraints on $\Omega_k$ in the PL model prefer a closed Universe at
the $\sim 3\sigma$ confidence level. The lens parameters are
constrained to best-fitting values of $(\gamma_0, \gamma_1) = (2.076, -0.309)$ with
the tanh function, and $(\gamma_0, \gamma_1) = (2.074, -0.307)$
with the SELU function. The results deviate from the standard
SIS model ($\gamma_0 = 2$, $\gamma_1 = 0$) at more than the $2\sigma$ confidence
level, suggesting that the total mass density profile of the
lens galaxy possibly evolves with cosmic time.
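To illustrate what a nonzero $\gamma_1$ implies, the sketch below evaluates the density slope at a few lens redshifts. It assumes that the PL model of Equation (7) uses the linear parametrization $\gamma(z_l) = \gamma_0 + \gamma_1 z_l$, which is an assumption here because that equation lies outside this section. The function names are hypothetical, and only the best-fitting central values for the tanh case are used, so the uncertainties are ignored.

```python
def pl_slope(z_l, gamma0, gamma1):
    """Density slope at lens redshift z_l (assumed linear evolution)."""
    return gamma0 + gamma1 * z_l


def isothermal_redshift(gamma0, gamma1):
    """Lens redshift at which the slope equals the isothermal value 2.

    Returns None when there is no evolution (gamma1 == 0).
    """
    if gamma1 == 0:
        return None
    return (2.0 - gamma0) / gamma1


# Best-fitting central values for the tanh reconstruction (Table 2).
GAMMA0, GAMMA1 = 2.076, -0.309

if __name__ == "__main__":
    for z_l in (0.0, 0.5, 1.0):
        print(f"z_l = {z_l:.1f}: gamma = {pl_slope(z_l, GAMMA0, GAMMA1):.4f}")
    print("isothermal at z_l =", isothermal_redshift(GAMMA0, GAMMA1))
```

Under this assumption the central values give a slope slightly steeper than isothermal at $z_l = 0$ and shallower than isothermal beyond $z_l \approx 0.25$, which is the sense in which the profile evolves with cosmic time.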

Figure 5. The two-dimensional confidence contours and one-dimensional
PDFs for the parameters within the PL model framework. The
results obtained using the distances reconstructed with the tanh and SELU
functions are represented by the red and blue lines, respectively.


In the context of the EPL lens model, the results obtained
with the two activation functions are shown in Table 3 and
Figure 6. The constraints in the EPL model are looser than those in
the SIS and PL models. The spatial curvature parameter is
constrained to be $\Omega_k < 0.011$ with the
tanh function and $\Omega_k < 0.051$ with the SELU function at the 95%
confidence level. For the lens parameters, the results
obtained with the two activation functions are consistent with each
other. The best-fitting values are $(\alpha, \delta) = (2.114, 2.383)$ with the
tanh function and $(\alpha, \delta) = (2.112, 2.375)$ with the
SELU function. Both results deviate from the SIS model
($\alpha = \delta = 2$) at more than the $1\sigma$ confidence level. In particular,
$\alpha = 2$ is ruled out at approximately the $3\sigma$ confidence
level. These results affirm that the influence of dark matter in
early-type galaxies should be considered and that the total-mass
profiles are not necessarily consistent with the luminosity
profiles.

Figure 6. The two-dimensional confidence contours and one-dimensional
PDFs for the parameters within the EPL model framework. The
results obtained using the distances reconstructed with the tanh and SELU
functions are represented by the red and blue lines, respectively.


Table 3
The Best-fitting Parameters in the Framework of the EPL Model Using the
Distance Reconstructed with Tanh and SELU Functions. The Constraints on $\Omega_k$
are Shown as 95% Confidence Level Upper Limits

          $\Omega_k$    $\alpha$    $\delta$
tanh      <0.011        2.114       2.383
SELU      <0.051        2.112       2.375


For clarity, Figure 7 shows the best-fitting values of the
curvature parameter $\Omega_k$, together with their $1\sigma$
uncertainties, in the SIS and PL models. The upper
and lower bounds of $\Omega_k$ in the EPL model are also shown.
To facilitate comparison, we include the constraints on
$\Omega_k$ from other cosmological probes, namely
Planck (Aghanim et al. 2020) and the extended Baryon Oscillation
Spectroscopic Survey (eBOSS; Alam et al. 2021). The
constraints derived from the PL model deviate noticeably
from the Planck and eBOSS results, whereas the results from
the SIS and EPL models are consistent with them. The
constraints in the EPL model, while consistent,
are somewhat less stringent. Our findings indicate
that, if the Universe is indeed flat, subtle deviations
from the isothermal profile are discernible in the lens
mass distribution. Moreover, factors such as the redshift
evolution of lens profiles and the role of dark matter
should be duly considered in cosmological studies.
Figure 7. Constraints on $\Omega_k$ in the three lens models using the two activation functions in our work, compared with the constraints from other cosmological
probes, Planck and eBOSS.


5. Discussion and Summary 


Based on geometrical optics, the DSR offers a model-independent
approach to testing the validity of the FRW metric
in cosmology, and it has proven to be a valuable tool for
constraining the spatial curvature of the Universe. Applying the
DSR method, Wang et al. (2020) recently investigated the
spatial curvature by combining an SGL sample with
the latest Pantheon SNe Ia. Although the full SGL sample
contains 161 systems, only 135 of them were usable in Wang et al.
(2020), because the GP regression used to reconstruct
the distance-redshift relation cannot reconstruct the
curve well beyond the data region. In this research, we use the
same data samples but constrain the spatial curvature with a
deep learning method. In contrast to the GP method, deep
learning is better able to reconstruct the curve
beyond the observed range. Hence, we can make use
of the full SGL sample and improve the precision of the
constraints.

In this study, we developed a combined RNN and BNN 
architecture to accurately reconstruct the distance-redshift 
relation using the Pantheon sample. The RNN component of 
the network is specifically designed to predict the comoving 
distance at a given redshift, while the BNN component serves 
as a valuable complement, allowing for the calculation of 
uncertainties associated with these predictions. In the process 
of the distance reconstruction, we considered four activation 
functions and found that only the tanh and SELU functions can 
reproduce the Pantheon data well. Hence, we calibrated the 
distance of SGL systems with tanh and SELU functions. To 
investigate the possible influence of different lens models on 
constraining the spatial curvature, we considered three types of 
lens models, i.e., the SIS model, PL model and EPL model. In
the SIS model, the spatial curvature is constrained to a
best-fitting value of $\Omega_k = 0.049$ with the tanh function, and
$\Omega_k = 0.082$ with the SELU function. Compared with the result of Wang
et al. (2020), who found a best-fitting value of $\Omega_k = 0.57$ that favors
an open Universe at $2\sigma$, our result favors a flat Universe with a
higher precision owing to the increase in available SGL data points.
In the PL model, a closed Universe is favored, with the curvature
parameter $\Omega_k = -0.245^{+0.075}_{-0.071}$ with the tanh function, and a
best-fitting value of $\Omega_k = -0.232$ with the SELU function, which are
consistent with the best-fitting value $\Omega_k = -0.246$ obtained in Wang et al.
(2020). In the EPL model, the spatial curvature is constrained
to be $\Omega_k < 0.011$ with the tanh function, and $\Omega_k < 0.051$ with
the SELU function. Compared with the best-fitting value
$\Omega_k = 0.25$ in Wang et al. (2020), our constraint on the
spatial curvature parameter is looser in the EPL model, but
there is no strong evidence ruling out a flat Universe. On the
other hand, for the lens parameters in the three lens models, the
results demonstrate that the lens galaxies cannot be simply
described by the standard SIS model.

In summary, the lens mass model has a noticeable
influence on the curvature parameter. In the SIS model, a
spatially flat Universe is favored within the $1\sigma$ uncertainty. In the
PL model, a closed Universe is favored at the $\sim 3\sigma$ confidence
level. In the EPL model, the constraint is relatively loose, but a
flat Universe cannot be excluded. More accurate modeling of
the lens mass profile is necessary to further improve the
constraint on the curvature parameter.